peptides will not maintain any amino acid composition pattern other than a random distribution. In other words, each of the 20 amino acids has the same probability of appearing at each residue of a non-cleaved peptide. Therefore, it is expected that the homology scores of cleaved peptides and the homology scores of non-cleaved peptides should show different distributions in theory.
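
The uniformity assumption can be written compactly. Here $x_i$ (the residue at position $i$ of a non-cleaved peptide) and $\mathcal{A}$ (the set of 20 amino acids) are notation introduced only for this illustration:

$$P(x_i = a) = \frac{1}{20} \quad \text{for every } a \in \mathcal{A} \text{ and every position } i.$$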

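Based on the above analysis, BBFNN is designed in the following way. Suppose there are K bio-bases. A non-numerical peptide is then mapped to a K-dimensional space, i.e., $\mathcal{B}(\mathbf{x}, \mathbf{s}_1)$, $\mathcal{B}(\mathbf{x}, \mathbf{s}_2)$, ..., and $\mathcal{B}(\mathbf{x}, \mathbf{s}_K)$. A linear function is generated for combining the K bio-basis functions through the weighting parameters $w_k$,

$$\sum_{k=1}^{K} w_k \mathcal{B}(\mathbf{x}, \mathbf{s}_k) \tag{3.42}$$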

It is expected that this linear combination of the bio-basis functions shows a bimodal distribution if the weights ($w_k$) have been well estimated. This bimodal distribution is expected to fit the classification of peptides in terms of peptide status ($y$). A linear classifier can then be built in this K-dimensional bio-basis function space,

$$\hat{y} = \sum_{k=1}^{K} w_k \mathcal{B}(\mathbf{x}, \mathbf{s}_k) \tag{3.43}$$
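
As a minimal sketch of Eq. (3.43), the following Python snippet maps one peptide into the K-dimensional bio-basis space and forms the weighted sum. Several things here are assumptions made for illustration and are not taken from the text: the bio-basis function is assumed to have the form exp(γ(h(x, s) − h(s, s))/h(s, s)); the homology score `toy_homology` is a hypothetical stand-in that simply counts identical residues (a real application would use a substitution-matrix score); and the bio-bases, weights, and γ are made-up values.

```python
import math

def toy_homology(x, s):
    """Hypothetical stand-in for a homology score: number of identical residues."""
    return sum(1 for a, b in zip(x, s) if a == b)

def bio_basis(x, s, gamma=1.0):
    """Assumed bio-basis form: exp(gamma * (h(x, s) - h(s, s)) / h(s, s))."""
    h_ss = toy_homology(s, s)
    return math.exp(gamma * (toy_homology(x, s) - h_ss) / h_ss)

def predict(x, bases, weights, gamma=1.0):
    """Weighted sum of the K bio-basis outputs for peptide x, as in Eq. (3.43)."""
    return sum(w * bio_basis(x, s, gamma) for w, s in zip(weights, bases))

bases = ["AAAA", "CCCC"]   # hypothetical bio-bases, K = 2
weights = [1.0, -1.0]      # hypothetical weights, not estimated from data
score = predict("AAAC", bases, weights)
```

With these toy values, a peptide that resembles the first bio-base more than the second receives a positive score, which is the kind of separation the bimodal argument relies on.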

Suppose X is a collection of peptides and S is a matrix of all peptides mapped to this K-dimensional bio-basis function space. The matrix S has N rows for N peptides and K columns for K bio-bases. A target vector of classification labels is denoted by y,

$$\mathbf{y} = (y_1, y_2, \cdots, y_N) \tag{3.44}$$

Similarly, a weight vector is represented by w,

$$\mathbf{w} = (w_1, w_2, \cdots, w_K) \tag{3.45}$$
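
To make the shapes concrete, the sketch below applies the linear combination to every row of S at once. The entries of `S`, the label encoding in `y`, and the weights in `w` are hypothetical numbers chosen only to show the bookkeeping (N = 3 peptides, K = 2 bio-bases); they are not results from the text.

```python
# Hypothetical mapped matrix S: one row per peptide, one column per bio-base.
S = [
    [1.0, 0.2],
    [0.5, 0.5],
    [0.1, 0.9],
]
y = [1, 0, 0]      # hypothetical classification labels, one per peptide (length N)
w = [1.0, -1.0]    # hypothetical weight vector, one entry per bio-base (length K)

def predict_all(S, w):
    """Return one linear-combination output per row of S."""
    return [sum(w_k * s_nk for w_k, s_nk in zip(w, row)) for row in S]

y_hat = predict_all(S, w)

n_peptides, n_bases = len(S), len(S[0])
```

Each output in `y_hat` is compared against the matching entry of `y` when the weights are estimated, so `y_hat` and `y` must both have length N, and `w` must have length K.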